35 research outputs found

    Privacy-preserving data sharing infrastructures for medical research: systematization and comparison

    Get PDF
    Background: Data sharing is considered a crucial part of modern medical research. Unfortunately, despite its advantages, it often faces obstacles, especially data privacy challenges. As a result, various approaches and infrastructures have been developed that aim to ensure that patients and research participants remain anonymous when data is shared. However, privacy protection typically comes at a cost, e.g. restrictions regarding the types of analyses that can be performed on shared data. What is lacking is a systematization making the trade-offs taken by different approaches transparent. The aim of the work described in this paper was to develop a systematization for the degree of privacy protection provided and the trade-offs taken by different data sharing methods. Based on this contribution, we categorized popular data sharing approaches and identified research gaps by analyzing combinations of promising properties and features that are not yet supported by existing approaches. Methods: The systematization consists of different axes. Three axes relate to privacy protection aspects and were adopted from the popular Five Safes Framework: (1) safe data, addressing privacy at the input level, (2) safe settings, addressing privacy during shared processing, and (3) safe outputs, addressing privacy protection of analysis results. Three additional axes address the usefulness of approaches: (4) support for de-duplication, to enable the reconciliation of data belonging to the same individuals, (5) flexibility, to be able to adapt to different data analysis requirements, and (6) scalability, to maintain performance with increasing complexity of shared data or common analysis processes. Results: Using the systematization, we identified three different categories of approaches: distributed data analyses, which exchange anonymous aggregated data, secure multi-party computation protocols, which exchange encrypted data, and data enclaves, which store pooled individual-level data in secure environments for access for analysis purposes. We identified important research gaps, including a lack of approaches enabling the de-duplication of horizontally distributed data or providing a high degree of flexibility. Conclusions: There are fundamental differences between different data sharing approaches and several gaps in their functionality that may be interesting to investigate in future work. Our systematization can make the properties of privacy-preserving data sharing infrastructures more transparent and support decision makers and regulatory authorities with a better understanding of the trade-offs taken

    Data integration and analysis for circadian medicine

    Get PDF
    Data integration, data sharing, and standardized analyses are important enablers for data-driven medical research. Circadian medicine is an emerging field with a particularly high need for coordinated and systematic collaboration between researchers from different disciplines. Datasets in circadian medicine are multimodal, ranging from molecular circadian profiles and clinical parameters to physiological measurements and data obtained from (wearable) sensors or reported by patients. Uniquely, data spanning both the time dimension and the spatial dimension (across tissues) are needed to obtain a holistic view of the circadian system. The study of human rhythms in the context of circadian medicine has to confront the heterogeneity of clock properties within and across subjects and our inability to repeatedly obtain relevant biosamples from one subject. This requires informatics solutions for integrating and visualizing relevant data types at various temporal resolutions ranging from milliseconds and seconds to minutes and several hours. Associated challenges range from a lack of standards that can be used to represent all required data in a common interoperable form, to challenges related to data storage, to the need to perform transformations for integrated visualizations, and to privacy issues. The downstream analysis of circadian rhythms requires specialized approaches for the identification, characterization, and discrimination of rhythms. We conclude that circadian medicine research provides an ideal environment for developing innovative methods to address challenges related to the collection, integration, visualization, and analysis of multimodal multidimensional biomedical data.Peer Reviewe

    Enabling Open Science in Medicine Through Data Sharing: An Overview and Assessment of Common Approaches from the European Perspective

    Get PDF
    Open Science involves the sharing of knowledge and data as well as the exchange of research results. This is particularly important in the biomedical field, as it can foster validation studies in response to the replication crisis and improve resource utilisation. Since medical data is particularly privacy sensitive, its processing is subject to strong data protection requirements. Agencies, institutions, and projects in the European Union are still struggling with the establishment of widely accepted mechanisms supporting the sharing of data for Open Science practices. The goal of this paper is to provide an overview of different methods that have been used for this purpose and to discuss their technical properties and legal challenges. Our assessment is based on well-known conceptualizations, such as the Five Safes Framework. The result shows that different approaches provide different trade-offs between the functionalities and the degree of data protection provided, and that there are open legal issues. Current legislative initiatives in the EU, including regulations for the European Health Data Space and the Data Governance Act, have the potential to address some of the resulting uncertainties

    Enhancing reuse of data and biological material in medical research : from FAIR to FAIR-Health

    Get PDF
    The known challenge of underutilization of data and biological material from biorepositories as potential resources formedical research has been the focus of discussion for over a decade. Recently developed guidelines for improved data availability and reusability—entitled FAIR Principles (Findability, Accessibility, Interoperability, and Reusability)—are likely to address only parts of the problem. In this article,we argue that biologicalmaterial and data should be viewed as a unified resource. This approach would facilitate access to complete provenance information, which is a prerequisite for reproducibility and meaningful integration of the data. A unified view also allows for optimization of long-term storage strategies, as demonstrated in the case of biobanks.Wepropose an extension of the FAIR Principles to include the following additional components: (1) quality aspects related to research reproducibility and meaningful reuse of the data, (2) incentives to stimulate effective enrichment of data sets and biological material collections and its reuse on all levels, and (3) privacy-respecting approaches for working with the human material and data. These FAIR-Health principles should then be applied to both the biological material and data. We also propose the development of common guidelines for cloud architectures, due to the unprecedented growth of volume and breadth of medical data generation, as well as the associated need to process the data efficiently.peer-reviewe

    Das Modell der Datentreuhand in der medizinischen Forschung

    Get PDF
    Der Konflikt zwischen Datenschutz und Forschungsfreiheit ist so alt wie das Datenschutzrecht selbst. In vielen Fällen wird Datenschutz als ein Hemmschuh für die Forschung wahrgenommen, gerade auch im Bereich der medizinischen Forschung. Ein aktuell viel diskutierter Lösungsansatz, diesen Konflikt zu entschärfen, ist das Modell der Datentreuhand. Ob und wie ein solches Modell nutzbar gemacht werden kann, um für die medizinische Forschung einen leichteren und großzügigeren Zugang zu personenbezogenen Daten zu eröffnen, soll im Folgenden erörtert werden

    SCOR: A secure international informatics infrastructure to investigate COVID-19

    Get PDF
    Global pandemics call for large and diverse healthcare data to study various risk factors, treatment options, and disease progression patterns. Despite the enormous efforts of many large data consortium initiatives, scientific community still lacks a secure and privacy-preserving infrastructure to support auditable data sharing and facilitate automated and legally compliant federated analysis on an international scale. Existing health informatics systems do not incorporate the latest progress in modern security and federated machine learning algorithms, which are poised to offer solutions. An international group of passionate researchers came together with a joint mission to solve the problem with our finest models and tools. The SCOR Consortium has developed a ready-to-deploy secure infrastructure using world-class privacy and security technologies to reconcile the privacy/utility conflicts. We hope our effort will make a change and accelerate research in future pandemics with broad and diverse samples on an international scale
    corecore